AITopics | task and motion planning

Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making

Li, Manling

Neural Information Processing Systems Feb-17-2026, 16:08:16 GMT

We aim to evaluate Large Language Models (LLMs) for embodied decision making.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Republic of Türkiye (0.04)
North America > United States > Indiana (0.04)
Asia > Japan > Shikoku > Kagawa Prefecture > Takamatsu (0.04)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Overview (0.67)

Industry:

Law (0.92)
Leisure & Entertainment (0.67)
Health & Medicine (0.67)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)

High-Performance Dual-Arm Task and Motion Planning for Tabletop Rearrangement

Zhang, Duo, Huang, Junshan, Yu, Jingjin

arXiv.org Artificial Intelligence Dec-10-2025

We propose Synchronous Dual-Arm Rearrangement Planner (SDAR), a task and motion planning (TAMP) framework for tabletop rearrangement, where two robot arms equipped with 2-finger grippers must work together in close proximity to rearrange objects whose start and goal configurations are strongly entangled. To tackle such challenges, SDAR tightly knits together its dependency-driven task planner (SDAR-T) and synchronous dual-arm motion planner (SDAR-M) to intelligently sift through a large number of possible task and motion plans. Specifically, SDAR-T applies a simple yet effective strategy to decompose the global object dependency graph induced by the rearrangement task, to produce more optimal dual-arm task plans than solutions derived from optimal task plans for a single arm. Leveraging state-of-the-art GPU SIMD-based motion planning tools, SDAR-M employs a layered motion planning strategy to sift through many task plans for the best synchronous dual-arm motion plan while ensuring a high success rate. Comprehensive evaluation demonstrates that SDAR delivers a 100% success rate in solving complex, non-monotone, long-horizon tabletop rearrangement tasks with solution quality far exceeding the previous state-of-the-art. Experiments on two UR-5e arms further confirm that SDAR directly and reliably transfers to robot hardware. Task and motion planning (TAMP) [1] represents a fundamental computational challenge in robotics, in which a robot system, e.g., one or more robot arms, must break down a given, potentially long-horizon task into suitable "bite-sized" sub-tasks that can be executed through short-horizon robot motions.
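
To make the idea of an object dependency graph concrete, here is a minimal Python sketch. It is not SDAR's implementation: the disc-shaped objects, the overlap test, and the names `dependency_graph` and `has_cycle` are all assumptions made for illustration. An edge from object i to object j means that j currently sits where i needs to go, and a cycle in the graph signals a non-monotone instance in which some object has to be moved to a temporary spot first.

```python
from itertools import permutations
from math import dist


def dependency_graph(starts, goals, radius):
    """Edge i -> j means object j's start pose blocks object i's goal pose.

    Objects are modelled as discs of equal radius (an assumption of this sketch).
    """
    graph = {name: set() for name in starts}
    for i, j in permutations(starts, 2):
        if dist(goals[i], starts[j]) < 2 * radius:
            graph[i].add(j)
    return graph


def has_cycle(graph):
    """Depth-first search with three colours; a grey-to-grey edge closes a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY or (colour[nxt] == WHITE and visit(nxt)):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in graph)


# Hypothetical instance: "a" and "b" must swap places, "c" moves freely.
starts = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (3.0, 0.0)}
goals = {"a": (1.0, 0.0), "b": (0.0, 0.0), "c": (4.0, 0.0)}
graph = dependency_graph(starts, goals, radius=0.2)
needs_buffer = has_cycle(graph)
```

In this toy instance the swap between "a" and "b" creates a two-node cycle, so `needs_buffer` is true, while "c" has no dependencies and could be moved at any time by either arm.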

artificial intelligence, configuration, motion planning, (16 more...)

arXiv.org Artificial Intelligence

2512.08206

Country:

North America > United States > New Jersey (0.28)
Europe (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)

Building Gradient by Gradient: Decentralised Energy Functions for Bimanual Robot Assembly

Mitchell, Alexander L., Watson, Joe, Posner, Ingmar

arXiv.org Artificial Intelligence Dec-9-2025

There are many challenges in bimanual assembly, including high-level sequencing, multi-robot coordination, and low-level, contact-rich operations such as component mating. Task and motion planning (TAMP) methods, while effective in this domain, may be prohibitively slow to converge when adapting to disturbances that require new task sequencing and optimisation. These events are common during tight-tolerance assembly, where difficult-to-model dynamics such as friction or deformation require rapid replanning and reattempts. Moreover, defining explicit task sequences for assembly can be cumbersome, limiting flexibility when task replanning is required. To simplify this planning, we introduce BGBG, a decentralised gradient-based framework that uses a piecewise continuous energy function through the automatic composition of adaptive potential functions. This approach generates sub-goals using only myopic optimisation, rather than long-horizon planning. It demonstrates effectiveness at solving long-horizon tasks due to the structure and adaptivity of the energy function. We show that our approach scales to physical bimanual assembly tasks for constructing tight-tolerance assemblies. In these experiments, we discover that our gradient-based rapid replanning framework generates automatic retries, coordinated motions and autonomous handovers in an emergent fashion. Bimanual assembly is an inherently sequential planning problem that demands reasoning over tasks and motions. The challenge is further amplified in contact-rich settings or when collaborating with humans, making efficient and robust planning essential for reliable execution.
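
The following toy sketch illustrates how myopic gradient descent on a state-dependent, piecewise energy can produce sequencing and retries without an explicit task plan. It is not the BGBG method: the single-peg scenario, the quadratic potentials, the geometry constants, and the function names are all assumptions chosen for illustration.

```python
# Hypothetical peg-in-hole geometry (metres); none of these values come from the paper.
X_HOLE, Z_ABOVE, Z_INSERTED = 0.0, 0.10, -0.05
ALIGN_TOL = 0.01


def energy_gradient(x, z):
    """Gradient of a piecewise quadratic energy over the peg position (x, z).

    The active vertical target depends only on the current state: hover above
    the hole until the peg is laterally aligned, then pull towards insertion.
    """
    z_target = Z_INSERTED if abs(x - X_HOLE) <= ALIGN_TOL else Z_ABOVE
    return x - X_HOLE, z - z_target


def descend(x, z, steps, lr=0.5):
    """Plain gradient descent; each step looks only at the local gradient."""
    for _ in range(steps):
        gx, gz = energy_gradient(x, z)
        x, z = x - lr * gx, z - lr * gz
    return x, z


# Nominal run: the peg aligns first, then inserts, with no explicit sequence given.
x, z = descend(1.0, 0.30, steps=60)

# Disturbance: the peg is knocked sideways. One step later it is already
# rising back towards the hover height, which amounts to an automatic retry.
x_retry, z_retry = descend(0.5, z, steps=1)
```

Because the energy is a function of the current state alone, a disturbance needs no replanning step: the descent simply follows whichever potential is active where the peg now is.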

artificial intelligence, assembly, energy function, (18 more...)

arXiv.org Artificial Intelligence

2510.04696

Genre: Research Report (0.82)

Industry: Energy (0.82)

Technology: Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)

LLM-GROP: Visually Grounded Robot Task and Motion Planning with Large Language Models

Zhang, Xiaohan, Ding, Yan, Hayamizu, Yohei, Altaweel, Zainab, Zhu, Yifeng, Zhu, Yuke, Stone, Peter, Paxton, Chris, Zhang, Shiqi

arXiv.org Artificial Intelligence Nov-12-2025

Task planning and motion planning are two of the most important problems in robotics, where task planning methods help robots achieve high-level goals and motion planning methods maintain low-level feasibility. Task and motion planning (TAMP) methods interleave the two processes of task planning and motion planning to ensure goal achievement and motion feasibility. Within the TAMP context, we are concerned with the mobile manipulation (MoMa) of multiple objects, where it is necessary to interleave actions for navigation and manipulation. In particular, we aim to compute where and how each object should be placed given underspecified goals, such as "set up dinner table with a fork, knife and plate." We leverage the rich common sense knowledge from large language models (LLMs), e.g., about how tableware is organized, to facilitate both task-level and motion-level planning. In addition, we use computer vision methods to learn a strategy for selecting base positions to facilitate MoMa behaviors, where the base position corresponds to the robot's "footprint" and orientation in its operating space. Altogether, this article provides a principled TAMP framework for MoMa tasks that accounts for common sense about object rearrangement and is adaptive to novel situations that include many objects that need to be moved. We performed quantitative experiments in both real-world settings and simulated environments. We evaluated the success rate and efficiency in completing long-horizon object rearrangement tasks. While the robot completed 84.4% of real-world object rearrangement trials, subjective human evaluations indicated that the robot's performance is still lower than that of experienced human waiters.

large language model, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1177/02783649251378196

2511.07727

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

ScheduleStream: Temporal Planning with Samplers for GPU-Accelerated Multi-Arm Task and Motion Planning & Scheduling

Garrett, Caelan, Ramos, Fabio

arXiv.org Artificial Intelligence Nov-10-2025

Bimanual and humanoid robots are appealing because of their human-like ability to leverage multiple arms to efficiently complete tasks. However, controlling multiple arms at once is computationally challenging due to the growth in the hybrid discrete-continuous action space. Task and Motion Planning (TAMP) algorithms can efficiently plan in hybrid spaces but generally produce plans in which only one arm moves at a time, rather than schedules that allow for parallel arm motion. In order to extend TAMP to produce schedules, we present ScheduleStream, the first general-purpose framework for planning & scheduling with sampling operations. ScheduleStream models temporal dynamics using hybrid durative actions, which can be started asynchronously and persist for a duration that's a function of their parameters. We propose domain-independent algorithms that solve ScheduleStream problems without any application-specific mechanisms. We apply ScheduleStream to Task and Motion Planning & Scheduling (TAMPAS), where we use GPU acceleration within samplers to expedite planning. We compare ScheduleStream algorithms to several ablations in simulation and find that they produce more efficient solutions. We demonstrate ScheduleStream on several real-world bimanual robot tasks at https://schedulestream.github.io.
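
The difference between a sequential plan and a schedule can be shown with a small Python sketch. It does not use the ScheduleStream API; the action tuples, the durations, and the `schedule` helper are hypothetical. Each durative action occupies one arm for a fixed time and may start as soon as its arm is free and its predecessors have finished.

```python
# Hypothetical durative actions: (name, arm, duration in seconds, predecessors).
# The list is assumed to be in a valid (topological) order.
actions = [
    ("left_pick", "left", 2.0, []),
    ("right_pick", "right", 3.0, []),
    ("left_place", "left", 2.0, ["left_pick"]),
    ("right_place", "right", 1.0, ["right_pick", "left_place"]),
]


def schedule(actions):
    """Start each action at the earliest time its arm and predecessors allow."""
    arm_free, start, end = {}, {}, {}
    for name, arm, duration, predecessors in actions:
        earliest = max([arm_free.get(arm, 0.0)] + [end[p] for p in predecessors])
        start[name] = earliest
        end[name] = earliest + duration
        arm_free[arm] = end[name]
    return start, max(end.values())


start_times, makespan = schedule(actions)

# A plan that moves one arm at a time takes the sum of all durations.
sequential_makespan = sum(duration for _, _, duration, _ in actions)
```

In this example the two picks overlap, so the schedule finishes in 5 seconds where the one-arm-at-a-time plan needs 8.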

algorithm, artificial intelligence, planning & scheduling, (16 more...)

arXiv.org Artificial Intelligence

2511.04758

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Grounded Vision-Language Interpreter for Integrated Task and Motion Planning

Siburian, Jeremy, Shirai, Keisuke, Beltran-Hernandez, Cristian C., Hamaya, Masashi, Görner, Michael, Hashimoto, Atsushi

arXiv.org Artificial Intelligence Nov-5-2025

While recent advances in vision-language models have accelerated the development of language-guided robot planners, their black-box nature often lacks safety guarantees and interpretability crucial for real-world deployment. Conversely, classical symbolic planners offer rigorous safety verification but require significant expert knowledge for setup. To bridge the current gap, this paper proposes ViLaIn-TAMP, a hybrid planning framework for enabling verifiable, interpretable, and autonomous robot behaviors. ViLaIn-TAMP comprises three main components: (1) a Vision-Language Interpreter (ViLaIn) adapted from previous work that converts multimodal inputs into structured problem specifications, (2) a modular Task and Motion Planning (TAMP) system that grounds these specifications in actionable trajectory sequences through symbolic and geometric constraint reasoning, and (3) a corrective planning (CP) module which receives concrete feedback on failed solution attempts and feeds it, together with constraints, back to ViLaIn to refine the specification. We design challenging manipulation tasks in a cooking domain and evaluate our framework. Experimental results demonstrate that ViLaIn-TAMP outperforms a VLM-as-a-planner baseline by 18% in mean success rate, and that adding the CP module boosts mean success rate by 32%.

artificial intelligence, machine learning, vilain-tamp, (17 more...)

arXiv.org Artificial Intelligence

2506.0327

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

SLAP: Shortcut Learning for Abstract Planning

Liu, Y. Isabel, Li, Bowen, Eysenbach, Benjamin, Silver, Tom

arXiv.org Artificial Intelligence Nov-4-2025

Long-horizon decision-making with sparse rewards and continuous states and actions remains a fundamental challenge in AI and robotics. Task and motion planning (TAMP) is a model-based framework that addresses this challenge by planning hierarchically with abstract actions (options). These options are manually defined, limiting the agent to behaviors that we as human engineers know how to program (pick, place, move). In this work, we propose Shortcut Learning for Abstract Planning (SLAP), a method that leverages existing TAMP options to automatically discover new ones. Our key idea is to use model-free reinforcement learning (RL) to learn shortcuts in the abstract planning graph induced by the existing options in TAMP. Without any additional assumptions or inputs, shortcut learning leads to shorter solutions than pure planning, and higher task success rates than flat and hierarchical RL. Qualitatively, SLAP discovers dynamic physical improvisations (e.g., slap, wiggle, wipe) that differ significantly from the manually-defined ones. In experiments in four simulated robotic environments, we show that SLAP solves and generalizes to a wide range of tasks, reducing overall plan lengths by over 50% and consistently outperforming planning and RL baselines.
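
As a rough illustration of why a shortcut in an abstract planning graph shortens plans, the sketch below runs breadth-first search over a hand-written graph before and after one extra edge is added. It is not the SLAP method: the abstract state names and the `plan` helper are hypothetical, and the shortcut is inserted by hand where SLAP would learn it with reinforcement learning.

```python
from collections import deque


def plan(graph, start, goal):
    """Breadth-first search; returns the shortest list of abstract states, or None."""
    frontier = deque([[start]])
    seen = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None


# Hypothetical abstract graph induced by hand-written options (pick, lift, move, place).
graph = {
    "on_table": ["grasped"],
    "grasped": ["lifted"],
    "lifted": ["over_bin"],
    "over_bin": ["in_bin"],
}
baseline = plan(graph, "on_table", "in_bin")

# A shortcut edge, standing in for a learned skill that sweeps the object
# straight into the bin.
with_shortcut = {**graph, "on_table": ["grasped", "in_bin"]}
shortened = plan(with_shortcut, "on_table", "in_bin")
```

Here the baseline plan uses four abstract actions and the plan with the shortcut uses one; the planner itself is unchanged, only the graph it searches has grown.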

artificial intelligence, machine learning, shortcut, (15 more...)

arXiv.org Artificial Intelligence

2511.01107

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.89)

Kinodynamic Task and Motion Planning using VLM-guided and Interleaved Sampling

Kwon, Minseo, Kim, Young J.

arXiv.org Artificial Intelligence Oct-31-2025

Task and Motion Planning (TAMP) integrates high-level task planning with low-level motion feasibility, but existing methods are costly in long-horizon problems due to excessive motion sampling. While LLMs provide commonsense priors, they lack 3D spatial reasoning and cannot ensure geometric or dynamic feasibility. We propose a kinodynamic TAMP framework based on a hybrid state tree that uniformly represents symbolic and numeric states during planning, enabling task and motion decisions to be made jointly. Kinodynamic constraints embedded in the TAMP problem are verified by an off-the-shelf motion planner and physics simulator, and a VLM guides the exploration of a TAMP solution and backtracks the search based on visual rendering of the states.

I. INTRODUCTION

Robotic manipulation tasks, such as tabletop manipulations, require reasoning over both symbolic task decisions and continuous geometric feasibility. A robot must decide which action to perform (such as picking, placing, or stacking) and which object to grasp, which constitutes a discrete search process. Simultaneously, it must determine grasp poses, feasible end-effector configurations, and collision-free motion trajectories governed by continuous constraints. This class of problems is studied under the framework of Task and Motion Planning (TAMP), which combines high-level task planning with continuous action parameter binding and low-level motion planning [1], [2].

artificial intelligence, constraint, task and motion planning, (14 more...)

arXiv.org Artificial Intelligence

2510.26139

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)

Learning to Plan & Schedule with Reinforcement-Learned Bimanual Robot Skills

Wan, Weikang, Ramos, Fabio, Yang, Xuning, Garrett, Caelan

arXiv.org Artificial Intelligence Oct-30-2025

Long-horizon contact-rich bimanual manipulation presents a significant challenge, requiring complex coordination involving a mixture of parallel execution and sequential collaboration between arms. In this paper, we introduce a hierarchical framework that frames this challenge as an integrated skill planning & scheduling problem, going beyond purely sequential decision-making to support simultaneous skill invocation. Our approach is built upon a library of single-arm and bimanual primitive skills, each trained using Reinforcement Learning (RL) in GPU-accelerated simulation. We then train a Transformer-based planner on a dataset of skill compositions to act as a high-level scheduler, simultaneously predicting the discrete schedule of skills as well as their continuous parameters. We demonstrate that our method achieves higher success rates on complex, contact-rich tasks than end-to-end RL approaches and produces more efficient, coordinated behaviors than traditional sequential-only planners.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2510.25634

Country:

North America > United States (0.46)
North America > Canada (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Using VLM Reasoning to Constrain Task and Motion Planning

Yan, Muyang, Mengdibayev, Miras, Floros, Ardon, Guo, Weihang, Kavraki, Lydia E., Kingston, Zachary

arXiv.org Artificial Intelligence Oct-30-2025

In task and motion planning, high-level task planning is done over an abstraction of the world to enable efficient search in long-horizon robotics problems. However, the feasibility of these task-level plans relies on the downward refinability of the abstraction into continuous motion. When a domain's refinability is poor, task-level plans that appear valid may ultimately fail during motion planning, requiring replanning and resulting in slower overall performance. Prior works mitigate this by encoding refinement issues as constraints to prune infeasible task plans. However, these approaches only add constraints upon refinement failure, expending significant search effort on infeasible branches. We propose VIZ-COAST, a method of leveraging the common-sense spatial reasoning of large pretrained Vision-Language Models to identify issues with downward refinement a priori, bypassing the need to fix these failures during planning. Experiments on two challenging TAMP domains show that our approach is able to extract plausible constraints from images and domain descriptions, drastically reducing planning times and, in some cases, eliminating downward refinement failures altogether, generalizing to a diverse range of instances from the broader domain.
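
The benefit of adding constraints before search, rather than after a refinement failure, can be sketched in a few lines of Python. This is not VIZ-COAST: no vision-language model is called, the ordering constraint is hard-coded as a stand-in for what such a model might propose, and the object names and the `violates` helper are hypothetical.

```python
from itertools import permutations


def violates(plan, constraints):
    """A constraint (first, then) requires `first` to be picked before `then`."""
    return any(plan.index(first) > plan.index(then) for first, then in constraints)


objects = ["mug", "box", "can"]

# Hypothetical constraint: the box stands in front of the mug, so picking the
# mug first would fail during motion planning.
constraints = [("box", "mug")]

# Every pick order is a candidate task plan; constrained ones are pruned
# before any motion planning is attempted.
candidates = list(permutations(objects))
feasible = [p for p in candidates if not violates(p, constraints)]
```

Of the six candidate pick orders, three survive the constraint, so half of the task plans never reach the (typically much more expensive) motion planner.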

artificial intelligence, constraint, refinement failure, (15 more...)

arXiv.org Artificial Intelligence

2510.25548

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)